Lyapunov Design for Safe Reinforcement Learning

نویسندگان

  • Theodore J. Perkins
  • Andrew G. Barto
چکیده

Lyapunov design methods are used widely in control engineering to design controllers that achieve qualitative objectives, such as stabilizing a system or maintaining a system’s state in a desired operating range. We propose a method for constructing safe, reliable reinforcement learning agents based on Lyapunov design principles. In our approach, an agent learns to control a system by switching among a number of given, base-level controllers. These controllers are designed using Lyapunov domain knowledge so that any switching policy is safe and enjoys basic performance guarantees. Our approach thus ensures qualitatively satisfactory agent behavior for virtually any reinforcement learning algorithm and at all times, including while the agent is learning and taking exploratory actions. We demonstrate the process of designing safe agents for four different control problems. In simulation experiments, we find that our theoretically motivated designs also enjoy a number of practical benefits, including reasonable performance initially and throughout learning, and accelerated learning.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Lyapunov Design for Safe Reinforcement Learning Control

We propose a general approach to safe reinforcement learning control based on Lyapunov design methods. In our approach, a Lyapunov function—a special form of domain knowledge—is used to formulate the action choices available to a reinforcement learning agent. A learning agent choosing among these actions provably enjoys performance guarantees, and satisfies safety constraints of various kinds. ...

متن کامل

Safe Model-based Reinforcement Learning with Stability Guarantees

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorith...

متن کامل

Q-Value Based Particle Swarm Optimization for Reinforcement Neuro- Fuzzy System Design

This paper proposes a combination of particle swarm optimization (PSO) and Q-value based safe reinforcement learning scheme for neuro-fuzzy systems (NFS). The proposed Q-value based particle swarm optimization (QPSO) fulfills PSO-based NFS with reinforcement learning; that is, it provides PSO-based NFS an alternative to learn optimal control policies under environments where only weak reinforce...

متن کامل

Lyapunov-Constrained Action Sets for Reinforcement Learning

Lyapunov analysis is a standard approach to studying the stability of dynamical systems and to designing controllers. We propose to design the actions of a reinforcement learning (RL) agent to be descending on a Lyapunov function. For minimum cost-to-target problems, this has the theoretical benefit of guaranteeing that the agent will reach a goal state on every trial, regardless of the RL algo...

متن کامل

Reinforcement Learning Based PID Control of Wind Energy Conversion Systems

In this paper an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. Theadaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinearcharacteristics of wind variations as plant input, wind turbine structure and generator operational behaviordemand for high quality adaptive controller to ensure both robust stability an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 3  شماره 

صفحات  -

تاریخ انتشار 2002